positive item sampling and fix infinite loop #7

isipalma · 2021-06-16T01:45:14Z

Add an iteration limit to the sampling loop to prevent an infinite loop.
Change the sampling method: the positive item is now always in the profile.

aaossa

I have some questions about the code, and I caught a possible bug.

aaossa · 2021-08-05T15:31:45Z

2 - Triplet sampling (Random).ipynb

+    "# Mark interactions used for evaluation procedure if needed\n",
+    "if \"evaluation\" not in interactions_df:\n",
+    "    print(\"\\nApply evaluation split...\")\n",
+    "    interactions_df = mark_evaluation_rows(interactions_df)\n",
+    "    # Check if new column exists and has boolean dtype\n",
+    "    assert interactions_df[\"evaluation\"].dtype.name == \"bool\"\n",
+    "    print(f\">> Interactions: {interactions_df.shape}\")\n",
+    "\n",


I forgot, why was this needed here?

I just noticed, the code was not present in this repository but it was in mine, right?

aaossa · 2021-08-05T15:33:51Z

2 - Triplet sampling (Random).ipynb

@@ -202,22 +210,27 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "def random_triplet_sampling(samples_per_user, hashes_container, desc=None):\n",
+    "def random_triplet_sampling(samples_per_user, hashes_container, desc=None, limit_iteration=10000):\n",


Why 10000? Maybe we could use the number of interactions as the limit, or a proportion of that number. If I have a million records and need to sample a significant share of them, a proportion of len(interactions_df) (or interactions_df.size, not sure which one is better) would be more appropriate than a fixed number.
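A minimal sketch of the proportional limit suggested above, not code from this PR. The helper name `iteration_limit`, the `proportion=0.1` default, and the `minimum` floor are all assumptions chosen for illustration. On the `len` versus `size` question: for a pandas DataFrame, `len(df)` is the number of rows, while `df.size` is rows times columns, so `len` is the closer match to "number of interactions". The sketch only relies on `len`, so a plain list stands in for the DataFrame here.

```python
def iteration_limit(interactions, proportion=0.1, minimum=1_000):
    """Hypothetical helper: scale the retry limit with the dataset size.

    `interactions` can be anything that supports len(), such as a
    DataFrame (row count) or a list of (user, item) pairs.
    """
    n_rows = len(interactions)
    # Never go below `minimum`, so tiny datasets still get some retries.
    return max(minimum, int(n_rows * proportion))


# Toy stand-in for interactions_df: 200 users x 100 items = 20,000 rows.
toy_interactions = [(user, item) for user in range(200) for item in range(100)]
limit = iteration_limit(toy_interactions)
```

The result could then be passed as `limit_iteration` when calling `random_triplet_sampling`, keeping the fixed value only as a fallback.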

aaossa · 2021-08-05T15:35:09Z

2 - Triplet sampling (Random).ipynb

+    "        aux_limit = limit_iteration\n",
    "        while n > 0:\n",
+    "            if aux_limit == 0:\n",
+    "                break\n",


aux_limit never changes its value, so the `aux_limit == 0` check can never trigger. In line 247 we should use aux_limit instead of limit_iteration; that may be the fix.
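Line 247 is not shown in this diff, so the following is only a sketch of the intended pattern under that assumption, not the notebook's actual code. The function `sample_until_limit`, its `candidates` and `is_valid` parameters, and the `rng` argument are hypothetical; only `aux_limit`, `limit_iteration`, and the `while n > 0` loop shape come from the diff.

```python
import random


def sample_until_limit(candidates, n, is_valid, limit_iteration=10000, rng=None):
    """Hypothetical sketch: draw up to n valid samples, giving up after
    limit_iteration attempts instead of looping forever."""
    if rng is None:
        rng = random.Random()
    samples = []
    aux_limit = limit_iteration
    while n > 0:
        if aux_limit == 0:
            break  # attempts exhausted, return what we have
        # Decrement the local counter, not limit_iteration, so the
        # guard above can actually fire.
        aux_limit -= 1
        candidate = rng.choice(candidates)
        if not is_valid(candidate):
            continue  # rejected draw still counts as an attempt
        samples.append(candidate)
        n -= 1
    return samples
```

Because the loop can now exit early, the caller may receive fewer than `n` samples, which is worth keeping in mind for any assertions on the total sample count.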

aaossa · 2021-08-05T15:35:34Z

2 - Triplet sampling (Random).ipynb

-    "assert len(samples_training) >= TOTAL_SAMPLES_TRAIN\n",
-    "assert len(samples_testing) >= TOTAL_SAMPLES_VALID\n",
-    "\n",


Why was this removed?

positive item sampling and fix infinite loop

34a43e7

isipalma marked this pull request as draft June 19, 2021 16:37

aaossa requested changes Aug 5, 2021


isipalma Sep 29, 2021
